Skip to content

Conversation

mizvekov
Copy link
Contributor

@mizvekov mizvekov commented Jul 9, 2025

This is a major change on how we represent nested name qualifications in the AST.

  • The nested name specifier itself and how it's stored is changed. The prefixes for types are handled within the type hierarchy, which makes canonicalization for them super cheap, no memory allocation required. Also translating a type into nested name specifier form becomes a no-op. An identifier is stored as a DependentNameType. The nested name specifier gains a lightweight handle class, to be used instead of passing around pointers, which is similar to what is implemented for TemplateName. There is still one free bit available, and this handle can be used within a PointerUnion and PointerIntPair, which should keep bit-packing aficionados happy.
  • The ElaboratedType node is removed, all type nodes in which it could previously apply to can now store the elaborated keyword and name qualifier, tail allocating when present.
  • TagTypes can now point to the exact declaration found when producing these, as opposed to the previous situation of there only existing one TagType per entity. This increases the amount of type sugar retained, and can have several applications, for example in tracking module ownership, and other tools which care about source file origins, such as IWYU. These TagTypes are lazily allocated, in order to limit the increase in AST size.

This patch offers a great performance benefit.

It greatly improves compilation time for stdexec. For one datapoint, for test_on2.cpp in that project, which is the slowest compiling test, this patch improves -c compilation time by about 7.2%, with the -fsyntax-only improvement being at ~12%.

This has great results on compile-time-tracker as well:
image

This patch also further enables other optimziations in the future, and will reduce the performance impact of template specialization resugaring when that lands.

It has some other miscelaneous drive-by fixes.

About the review: Yes the patch is huge, sorry about that. Part of the reason is that I started by the nested name specifier part, before the ElaboratedType part, but that had a huge performance downside, as ElaboratedType is a big performance hog. I didn't have the steam to go back and change the patch after the fact.

There is also a lot of internal API changes, and it made sense to remove ElaboratedType in one go, versus removing it from one type at a time, as that would present much more churn to the users. Also, the nested name specifier having a different API avoids missing changes related to how prefixes work now, which could make existing code compile but not work.

How to review: The important changes are all in clang/include/clang/AST and clang/lib/AST, with also important changes in clang/lib/Sema/TreeTransform.h.

The rest and bulk of the changes are mostly consequences of the changes in API.

PS: TagType::getDecl is renamed to getOriginalDecl in this patch, just for easier to rebasing. I plan to rename it back after this lands.

Fixes #136624
Fixes #43179
Fixes #68670
Fixes #92757

@shafik
Copy link
Collaborator

shafik commented Sep 3, 2025

This feels like a regression, unless you think the bug is elsewhere and the assert just catches it now: #156458 (comment)

mizvekov added a commit that referenced this pull request Sep 3, 2025
This fixes a regression which was introduced in #147835.

Since this regression was never released, there are no release notes.

Fixes #156458
mizvekov added a commit that referenced this pull request Sep 4, 2025
This fixes a regression which was introduced in #147835.

Since this regression was never released, there are no release notes.

Fixes #156458
@katzdm
Copy link
Contributor

katzdm commented Sep 5, 2025

"And on that day, every maintainer of a nontrivial clang fork cried out to the sky in unison."

ahatanaka added a commit to swiftlang/llvm-project that referenced this pull request Sep 5, 2025
The tests started failing after the following PR was merged:
llvm#147835

rdar://159140646
ahatanaka added a commit to swiftlang/llvm-project that referenced this pull request Sep 5, 2025
The tests started failing after the following PR was merged:
llvm#147835

rdar://159140646
@joanahalili
Copy link

joanahalili commented Sep 10, 2025

@mizvekov there is one more instance of a clang crash that we are encountering during our testing at google. Here is the reproducer https://godbolt.org/z/jfeW9d9Ts. Please take a look.

@alexfh
Copy link
Contributor

alexfh commented Sep 10, 2025

For convenience copying the code and clang stdout here:

class A;
using T = __attribute__((swiftasynccall)) void(
    __attribute__((swift_async_context)) A *);
T R;
class A;
__attribute__((swiftasynccall)) void
fa(__attribute__((swift_async_context))
                                  A *) {
  R == fa;
}
clang++: /root/llvm-project/llvm/tools/clang/lib/AST/ASTContext.cpp:14198: clang::QualType getCommonNonSugarTypeNode(const clang::ASTContext&, const clang::Type*, clang::Qualifiers&, const clang::Type*, clang::Qualifiers&): Assertion `EPIX.ExtParameterInfos == EPIY.ExtParameterInfos' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /opt/compiler-explorer/clang-assertions-trunk/bin/clang++ -g -o /app/output.s -mllvm --x86-asm-syntax=intel -fno-verbose-asm -S --gcc-toolchain=/opt/compiler-explorer/gcc-snapshot -fcolor-diagnostics -fno-crash-diagnostics -std=gnu++20 <source>
1.	<source>:9:10: current parser token ';'
2.	<source>:8:40: parsing function body 'fa'
3.	<source>:8:40: in compound statement ('{}')
 #0 0x0000000004083a38 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4083a38)
 #1 0x0000000004080e64 llvm::sys::CleanupOnSignal(unsigned long) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4080e64)
 #2 0x0000000003fc51e8 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007af017a42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #4 0x00007af017a969fc pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x969fc)
 #5 0x00007af017a42476 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42476)
 #6 0x00007af017a287f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3)
 #7 0x00007af017a2871b (/lib/x86_64-linux-gnu/libc.so.6+0x2871b)
 #8 0x00007af017a39e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
 #9 0x00000000076059d8 getCommonNonSugarTypeNode(clang::ASTContext const&, clang::Type const*, clang::Qualifiers&, clang::Type const*, clang::Qualifiers&) ASTContext.cpp:0:0
#10 0x0000000007605f3b clang::ASTContext::getCommonSugaredType(clang::QualType, clang::QualType, bool) const (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x7605f3b)
#11 0x0000000006d4390d clang::Sema::FindCompositePointerType(clang::SourceLocation, clang::Expr*&, clang::Expr*&, bool) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x6d4390d)
#12 0x0000000006c38cd1 convertPointersToCompositeType(clang::Sema&, clang::SourceLocation, clang::ActionResult<clang::Expr*, true>&, clang::ActionResult<clang::Expr*, true>&) SemaExpr.cpp:0:0
#13 0x0000000006c6c125 clang::Sema::CheckCompareOperands(clang::ActionResult<clang::Expr*, true>&, clang::ActionResult<clang::Expr*, true>&, clang::SourceLocation, clang::BinaryOperatorKind) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x6c6c125)
#14 0x0000000006c7454d clang::Sema::CreateBuiltinBinOp(clang::SourceLocation, clang::BinaryOperatorKind, clang::Expr*, clang::Expr*, bool) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x6c7454d)
#15 0x0000000006c75f98 clang::Sema::BuildBinOp(clang::Scope*, clang::SourceLocation, clang::BinaryOperatorKind, clang::Expr*, clang::Expr*, bool) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x6c75f98)
#16 0x0000000006c76606 clang::Sema::ActOnBinOp(clang::Scope*, clang::SourceLocation, clang::tok::TokenKind, clang::Expr*, clang::Expr*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x6c76606)
#17 0x00000000067af1da clang::Parser::ParseRHSOfBinaryExpression(clang::ActionResult<clang::Expr*, true>, clang::prec::Level) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x67af1da)
#18 0x00000000067b27f9 clang::Parser::ParseExpression(clang::TypoCorrectionTypeBehavior) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x67b27f9)
#19 0x00000000068363f9 clang::Parser::ParseExprStatement(clang::Parser::ParsedStmtContext) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x68363f9)
#20 0x000000000682d784 clang::Parser::ParseStatementOrDeclarationAfterAttributes(llvm::SmallVector<clang::Stmt*, 24u>&, clang::Parser::ParsedStmtContext, clang::SourceLocation*, clang::ParsedAttributes&, clang::ParsedAttributes&, clang::LabelDecl*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x682d784)
#21 0x000000000682e5f9 clang::Parser::ParseStatementOrDeclaration(llvm::SmallVector<clang::Stmt*, 24u>&, clang::Parser::ParsedStmtContext, clang::SourceLocation*, clang::LabelDecl*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x682e5f9)
#22 0x0000000006836b0e clang::Parser::ParseCompoundStatementBody(bool) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x6836b0e)
#23 0x000000000683732a clang::Parser::ParseFunctionStatementBody(clang::Decl*, clang::Parser::ParseScope&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x683732a)
#24 0x000000000673e82b clang::Parser::ParseFunctionDefinition(clang::ParsingDeclarator&, clang::Parser::ParsedTemplateInfo const&, clang::Parser::LateParsedAttrList*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x673e82b)
#25 0x0000000006774c8d clang::Parser::ParseDeclGroup(clang::ParsingDeclSpec&, clang::DeclaratorContext, clang::ParsedAttributes&, clang::Parser::ParsedTemplateInfo&, clang::SourceLocation*, clang::Parser::ForRangeInit*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x6774c8d)
#26 0x00000000067315fe clang::Parser::ParseDeclOrFunctionDefInternal(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec&, clang::AccessSpecifier) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x67315fe)
#27 0x0000000006731d9f clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*, clang::AccessSpecifier) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x6731d9f)
#28 0x0000000006739bda clang::Parser::ParseExternalDeclaration(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x6739bda)
#29 0x000000000673ab75 clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, clang::Sema::ModuleImportState&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x673ab75)
#30 0x000000000671b5ea clang::ParseAST(clang::Sema&, bool, bool) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x671b5ea)
#31 0x00000000049bbfa8 clang::CodeGenAction::ExecuteAction() (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x49bbfa8)
#32 0x0000000004cb65c5 clang::FrontendAction::Execute() (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4cb65c5)
#33 0x0000000004c31b8e clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4c31b8e)
#34 0x0000000004dabd81 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x4dabd81)
#35 0x0000000000db57bf cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0xdb57bf)
#36 0x0000000000dac46a ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#37 0x0000000004a25019 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
#38 0x0000000003fc5684 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x3fc5684)
#39 0x0000000004a2562f clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
#40 0x00000000049e619d clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x49e619d)
#41 0x00000000049e722e clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x49e722e)
#42 0x00000000049ef1c5 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0x49ef1c5)
#43 0x0000000000db1c75 clang_main(int, char**, llvm::ToolContext const&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0xdb1c75)
#44 0x0000000000c647d4 main (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0xc647d4)
#45 0x00007af017a29d90 (/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
#46 0x00007af017a29e40 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e40)
#47 0x0000000000dabf15 _start (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+++0xdabf15)

@alexfh
Copy link
Contributor

alexfh commented Sep 10, 2025

And a bit cleaned up (https://godbolt.org/z/q9fGPj54d):

class A;
using T = void(__attribute__((swift_async_context)) A *);
T R;
class A;
void fa(__attribute__((swift_async_context)) A *);

bool f() {
  return R == fa;
}
clang++: /root/llvm-project/llvm/tools/clang/lib/AST/ASTContext.cpp:14198: clang::QualType getCommonNonSugarTypeNode(const clang::ASTContext&, const clang::Type*, clang::Qualifiers&, const clang::Type*, clang::Qualifiers&): Assertion `EPIX.ExtParameterInfos == EPIY.ExtParameterInfos' failed.
...

@mizvekov
Copy link
Contributor Author

FWIW that crash is pre-existing, this PR just adds another way to reproduce it.

Here is a reproducer for clang-21, using Typedefs instead of RecordTypes to induce the problem: https://compiler-explorer.com/z/E89P31brT

class A;
using B = A;
using T = void(__attribute__((swift_async_context)) B *);
T R;

using B = A;
void fa(__attribute__((swift_async_context)) B *);

bool f() {
  return R == fa;
}

mizvekov added a commit that referenced this pull request Sep 10, 2025
This fixes an assumption that the ExtInfo for two same function types
would have referential equality.

This should compare these ExtInfos by value instead.

The bug is pre-existing to #147835,
but that PR adds another way to reproduce it.
mizvekov added a commit that referenced this pull request Sep 10, 2025
)

This fixes an assumption that the ExtInfo for two same function types
would have referential equality.

This should compare these ExtInfos by value instead.

The bug is pre-existing to
#147835, but that PR adds
another way to reproduce it.
@mizvekov
Copy link
Contributor Author

@joanahalili @alexfh that's fixed in #157925

llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Sep 10, 2025
…tInfo (#157925)

This fixes an assumption that the ExtInfo for two same function types
would have referential equality.

This should compare these ExtInfos by value instead.

The bug is pre-existing to
llvm/llvm-project#147835, but that PR adds
another way to reproduce it.
@zmodem
Copy link
Collaborator

zmodem commented Sep 11, 2025

On a positive note, we just updated Chromium's toolchain and saw a 5% reduction in build time, which I'm assuming is due to this change :-)

mizvekov added a commit that referenced this pull request Oct 3, 2025
This fixes a regression introduced in #147835

When parsing a lambda where the call operator has a late parsed attribute,
we would try to build a 'this' type for the lambda, but in a lambda 'this'
never refers to the lambda class itself.

This late parsed attribute can be added implicitly by the -ftrapping-math flag.

This patch patch makes it so CXXThisScopeRAII ignores lambdas.

This became observable in #147835 because that made clang lazily create
tag types, and it removed a workaround for a lambda dependency bug
where any previously created tag type for the lambda is dicarded after
its dependency is recalculated.

But the 'this' scope created above would defeat this laziness and create
the lambda type too soon, before its dependency was updated.

Since this regression was never released, there are no release notes.

Fixes #161657
mizvekov added a commit that referenced this pull request Oct 3, 2025
This fixes a regression introduced in #147835

When parsing a lambda where the call operator has a late parsed
attribute, we would try to build a 'this' type for the lambda, but in a
lambda 'this' never refers to the lambda class itself.

This late parsed attribute can be added implicitly by the
-ftrapping-math flag.

This patch patch makes it so CXXThisScopeRAII ignores lambdas.

This became observable in #147835 because that made clang lazily create
tag types, and it removed a workaround for a lambda dependency bug where
any previously created tag type for the lambda is discarded after its
dependency is recalculated.

But the 'this' scope created above would defeat this laziness and create
the lambda type too soon, before its dependency was updated.

Since this regression was never released, there are no release notes.

Fixes #161657
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
…161765)

This fixes a regression introduced in llvm#147835

When parsing a lambda where the call operator has a late parsed
attribute, we would try to build a 'this' type for the lambda, but in a
lambda 'this' never refers to the lambda class itself.

This late parsed attribute can be added implicitly by the
-ftrapping-math flag.

This patch patch makes it so CXXThisScopeRAII ignores lambdas.

This became observable in llvm#147835 because that made clang lazily create
tag types, and it removed a workaround for a lambda dependency bug where
any previously created tag type for the lambda is discarded after its
dependency is recalculated.

But the 'this' scope created above would defeat this laziness and create
the lambda type too soon, before its dependency was updated.

Since this regression was never released, there are no release notes.

Fixes llvm#161657
MixedMatched pushed a commit to MixedMatched/llvm-project that referenced this pull request Oct 3, 2025
…161765)

This fixes a regression introduced in llvm#147835

When parsing a lambda where the call operator has a late parsed
attribute, we would try to build a 'this' type for the lambda, but in a
lambda 'this' never refers to the lambda class itself.

This late parsed attribute can be added implicitly by the
-ftrapping-math flag.

This patch patch makes it so CXXThisScopeRAII ignores lambdas.

This became observable in llvm#147835 because that made clang lazily create
tag types, and it removed a workaround for a lambda dependency bug where
any previously created tag type for the lambda is discarded after its
dependency is recalculated.

But the 'this' scope created above would defeat this laziness and create
the lambda type too soon, before its dependency was updated.

Since this regression was never released, there are no release notes.

Fixes llvm#161657
@emaxx-google
Copy link
Contributor

Hello, did these changes modify the semantics of NestedNameSpecifierLoc::getLocalSourceRange()? For a code like void n::C::f() {}, previous versions of Clang returned C from getLocalSourceRange(), but now it's returning n::C. The new behavior seems to contradict the documentation of this method.

@mizvekov
Copy link
Contributor Author

mizvekov commented Oct 9, 2025

Hello, did these changes modify the semantics of NestedNameSpecifierLoc::getLocalSourceRange()? For a code like void n::C::f() {}, previous versions of Clang returned C from getLocalSourceRange(), but now it's returning n::C. The new behavior seems to contradict the documentation of this method.

I think the documentation is right and matches what's happening.
It's just that what's considered local has changed in the case of a NestedNameSpecifier which is a Type.
The type was previously split-up, each chunk being its own nested name specifier.
Now a Type is always a NestedNameSpecifier component by itself.

Sounds like you want to get the beginning of the range as something like:

if (auto TL = NNSL->getAsTypeLoc())
  return TL.getNonPrefixBeginLoc();
else
  return NNSL->getLocalBeginLoc();

I didn't add a helper for this because there was just a single user in the llvm codebase using this pattern.
See the clang-tools-extra/clangd/FindTarget.cpp change from this patch.

@hokein
Copy link
Collaborator

hokein commented Oct 9, 2025

I think the documentation is right and matches what's happening.
It's just that what's considered local has changed in the case of a NestedNameSpecifier which is a Type.
The type was previously split-up, each chunk being its own nested name specifier.
Now a Type is always a NestedNameSpecifier component by itself.

Just my two cents.

I think the term "local" is ambiguous, and my mental model has always been that it refers to the last specifier in a qualified name. This seems to have been the behavior, and the public API doc implies it.

This recent change seems to break that model and introduces behavior that feels inconsistent:

  • For namespace specifier ns1::ns2::, getLocalSourceRange returns the range of "ns2::";
  • For type specifier ns1::C1::, getLocalSourceRange returns the range of "ns1::C1::";

This difference is confusing. While I understand the reasoning behind the internal implementation, API users aren't likely to be aware of it, and the "last piece" model seems more intuitive.

If this new behavior is intended, could we update the comments for getLocalSourceRange() to clarify these new rules and provide examples for each case? or even would it make sense to preserve the old behavior for the sake of API consistency?

@mizvekov
Copy link
Contributor Author

Yeah, all except one user of getLocalSourceRange expected the behavior to be what it is now.

I propose adding a note about this in the documentation, add a new helper which does what getLocalSourceRange used to do, and migrate the single existing user to use that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 backend:AMDGPU backend:ARC backend:ARM backend:CSKY backend:Hexagon backend:Lanai backend:loongarch backend:MIPS backend:PowerPC backend:RISC-V backend:Sparc backend:SystemZ backend:WebAssembly backend:X86 clang:analysis clang:as-a-library libclang and C++ API clang:bytecode Issues for the clang bytecode constexpr interpreter clang:codegen IR generation bugs: mangling, exceptions, etc. clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang:openmp OpenMP related changes to Clang clang:static analyzer clang Clang issues not falling into any other category clang-tidy clang-tools-extra clangd coroutines C++20 coroutines debuginfo HLSL HLSL Language Support libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. lldb
Projects
None yet